Arabidopsis thaliana gene model alignments

Arabidopsis thaliana gene model sequences were aligned to the Physcomitrella patens genome using the CAT program (Li et al. 2007) with default parameters and the default similarity matrix.
To remove poor-quality alignments, four filters were applied: 1) alignment segments with a negative CAT value for matched bases were removed; 2) alignments with an overall alignment percentage below 30% were removed (overall% = percent match ID * percent match length); 3) where sequences mapped to multiple locations, only alignments at least 70% similar to the match with the best overall% were retained; 4) alignment segments separated by >3 kb were evaluated as distinct groups, and only the group with the highest overall% was retained.
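
The filtering logic can be illustrated with a minimal Python sketch. It is not the code used to produce these alignments and does not parse CAT output. The record type, field names, and function names are hypothetical, percentages are stored as fractions between 0 and 1, and filter 3 is assumed to mean "overall% at least 70% of the best overall% for the same sequence". Filter 1 is omitted because it depends on a CAT-specific score, and for filter 4 only the grouping step is shown, since the way overall% is computed for a group is not specified here.

```python
from collections import defaultdict
from dataclasses import dataclass

MIN_OVERALL = 0.30      # filter 2: minimum overall% (as a fraction)
MIN_REL_TO_BEST = 0.70  # filter 3: assumed to mean >= 70% of the best overall%
MAX_GAP = 3000          # filter 4: segments more than 3 kb apart form separate groups


@dataclass
class Alignment:
    """Hypothetical alignment record; not the CAT output format."""
    query: str       # gene model identifier
    frac_id: float   # fraction of matching bases, 0..1
    frac_len: float  # fraction of the query length aligned, 0..1

    @property
    def overall(self) -> float:
        # overall% = percent match ID * percent match length
        return self.frac_id * self.frac_len


def filter_alignments(alignments):
    """Apply filters 2 and 3 to a list of Alignment records."""
    by_query = defaultdict(list)
    for aln in alignments:
        if aln.overall >= MIN_OVERALL:          # filter 2
            by_query[aln.query].append(aln)
    kept = []
    for hits in by_query.values():
        best = max(h.overall for h in hits)
        # filter 3: keep hits close enough to the best hit for this query
        kept.extend(h for h in hits if h.overall >= MIN_REL_TO_BEST * best)
    return kept


def split_by_gap(segments, max_gap=MAX_GAP):
    """Group (start, end) segments; a gap larger than max_gap starts a new group."""
    groups = []
    for start, end in sorted(segments):
        if groups and start - groups[-1][-1][1] <= max_gap:
            groups[-1].append((start, end))
        else:
            groups.append([(start, end)])
    return groups
```

For example, a sequence with hits scoring 0.81, 0.64, and 0.42 would keep only the first two under these assumptions, because 0.42 is below 70% of 0.81.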